69 research outputs found

    Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text

    Real-world multimedia data is often composed of multiple modalities, such as an image or a video with associated text (e.g. captions, user comments, etc.) and metadata. Such multimodal data packages are prone to manipulations, where a subset of these modalities can be altered to misrepresent or repurpose data packages, with possible malicious intent. It is, therefore, important to develop methods to assess or verify the integrity of these multimedia packages. Using computer vision and natural language processing methods to directly compare the image (or video) and the associated caption to verify the integrity of a media package is only possible for a limited set of objects and scenes. In this paper, we present a novel deep learning-based approach for assessing the semantic integrity of multimedia packages containing images and captions, using a reference set of multimedia packages. We construct a joint embedding of images and captions with deep multimodal representation learning on the reference dataset in a framework that also provides image-caption consistency scores (ICCSs). The integrity of query media packages is assessed as the inlierness of the query ICCSs with respect to the reference dataset. We present the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media packages from Flickr, which we make available to the research community. We use the newly created dataset as well as the Flickr30K and MS COCO datasets to quantitatively evaluate our proposed approach. The reference dataset does not contain unmanipulated versions of tampered query packages. Our method is able to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO, respectively, for detecting semantically incoherent media packages. Comment: *Ayush Jaiswal and Ekraam Sabir contributed equally to the work in this paper.

    Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

    Since collecting labels is costly and time-consuming, unsupervised methods are preferred in applications such as fraud detection. Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns, as the patterns of different clusters may appear in different dimensions. Existing methods propose to model the data clusters on selected dimensions, yet globally omitting any dimension may damage the pattern of certain clusters. To address the above issues, we propose a novel unsupervised generative framework called FIRD, which utilizes adversarial distributions to fit and disentangle the heterogeneous statistical patterns. When applied to discrete spaces, FIRD effectively distinguishes synchronized fraudsters from normal users. In addition, FIRD provides superior performance on anomaly detection datasets compared with SOTA anomaly detection methods (over 5% average AUC improvement). The experimental results on various datasets verify that the proposed method can better model the heterogeneous statistical patterns in high-dimensional data and benefit downstream applications.

    Security in Process: Detecting Attacks in Industrial Process Data

    Due to the fourth industrial revolution, industrial applications make use of the progress in communication and embedded devices. This allows industrial users to increase efficiency and manageability while reducing cost and effort. Furthermore, the fourth industrial revolution, creating the so-called Industry 4.0, opens a variety of novel use and business cases in the industrial environment. However, this progress comes at the cost of an enlarged attack surface of industrial companies. Operational networks that have previously been physically separated from public networks are now connected in order to make use of new communication capabilities. This motivates the need for industrial intrusion detection solutions that are compatible with the long-term operation of machines in industry as well as with heterogeneous and fast-changing networks. In this work, process data is analysed. The data is created and monitored on real-world hardware. After a set-up phase, attacks that influence the process behaviour are introduced into the systems. A time series-based anomaly detection approach, Matrix Profiles, is adapted to the specific needs and applied to intrusion detection. The results indicate that these methods are applicable to detecting attacks in the process behaviour. Furthermore, they are easily integrated into existing process environments. Additionally, the one-class classifiers One-Class Support Vector Machine and Isolation Forest are applied to the data without a notion of timing. While Matrix Profiles perform well in terms of creating and visualising results, the one-class classifiers perform poorly.
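
The abstract above does not describe how the Matrix Profile was adapted, so the following is only a generic, minimal sketch of the textbook idea: for every window of a time series, record the z-normalized Euclidean distance to its nearest non-overlapping neighbour, and treat the window with the largest such distance (the "discord") as the most anomalous. The function names, the brute-force search, and the exclusion zone of half a window are assumptions made for illustration, not details taken from the paper.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Brute-force matrix profile with window length m.

    Entry i is the distance from window i to its nearest neighbour
    among windows that start at least m // 2 positions away
    (the exclusion zone is an assumption of this sketch).
    """
    count = len(series) - m + 1
    exclusion = max(1, m // 2)
    if count <= exclusion:
        raise ValueError("series too short for this window length")
    windows = [znorm(series[i:i + m]) for i in range(count)]
    profile = []
    for i in range(count):
        best = math.inf
        for j in range(count):
            if abs(i - j) < exclusion:
                continue  # skip trivial matches with overlapping windows
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            best = min(best, dist)
        profile.append(best)
    return profile


def top_discord(profile):
    """Index of the window whose nearest neighbour is farthest away."""
    return max(range(len(profile)), key=profile.__getitem__)
```

On a periodic signal with a short disturbed stretch, `top_discord(matrix_profile(series, m))` should point at a window overlapping the disturbance, because every undisturbed window has a near-identical copy one period away. The brute-force loop is quadratic in the series length; for real process data a dedicated library would be the more practical choice.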

    Mid-infrared plasmons in scaled graphene nanostructures

    Plasmonics takes advantage of the collective response of electrons to electromagnetic waves, enabling dramatic scaling of optical devices beyond the diffraction limit. Here, we demonstrate mid-infrared (4 to 15 microns) plasmons in deeply scaled graphene nanostructures down to 50 nm, more than 100 times smaller than the on-resonance light wavelength in free space. We reveal, for the first time, the crucial damping channels of graphene plasmons via its intrinsic optical phonons and scattering from the edges. A plasmon lifetime of 20 femtoseconds and smaller is observed when damping through the emission of an optical phonon is allowed. Furthermore, the surface polar phonons in the SiO2 substrate underneath the graphene nanostructures lead to a significantly modified plasmon dispersion and damping, in contrast to a non-polar diamond-like-carbon (DLC) substrate. Much reduced damping is realized when the plasmon resonance frequencies are close to the polar phonon frequencies. Our study paves the way for applications of graphene in plasmonic waveguides, modulators and detectors in an unprecedentedly broad wavelength range from sub-terahertz to mid-infrared. Comment: submitted

    A Novel Heat Shock Transcription Factor Family in Entamoeba histolytica

    The heat shock transcription factor (HSTF) is a master molecule involved in the transcriptional control of several genes during different types of stress. This transcription factor is a highly conserved protein identified in organisms from bacteria to humans. Entamoeba histolytica is the protozoan responsible for human amoebiasis. This parasite is exposed to different kinds of stress, such as changes in pH and temperature and exposure to drugs, all situations that the parasite needs to survive. Here we identified and isolated a novel gene family of HSTFs in the protozoan parasite E. histolytica. Three members, which we called Ehhstf1, Ehhstf2 and Ehhstf3, compose this family. Amino acid alignments and domain architecture analysis revealed that the EhHSTFs present a conserved DNA-binding domain composed of approximately 25 residues. Interestingly, this domain is shorter than the domain of the human, mouse and yeast HSTFs. Heterologous antibodies recognized four peptides of 73, 66, 47 and 23 kDa in total extracts from trophozoites grown under normal conditions. The 73, 47 and 23 kDa peptides increased in intensity when the cells were grown at 42°C for 2 h. Taken together, the results demonstrate that the amoeba has HSTFs, which may control the gene expression of this parasite under different stress situations.

    Neuromatch Academy: a 3-week, online summer school in computational neuroscience

    Neuromatch Academy (https://academy.neuromatch.io; van Viegen et al., 2021) was designed as an online summer school to cover the basics of computational neuroscience in three weeks. The materials cover dominant and emerging computational neuroscience tools and how they complement one another, and focus specifically on how they can help us better understand how the brain functions. An original component of the materials is their focus on modeling choices, i.e. how we choose the right approach, how we build models, and how we can evaluate models to determine whether they provide real (meaningful) insight. This meta-modeling component of the instructional materials asks what questions can be answered by different techniques, and how to apply them meaningfully to gain insight into brain function.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome, including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.